fix(openai): Respect 300k token limit for embeddings API requests #33668
Conversation
- Add strict parameter to ChatDeepSeek class
- Switch to Beta API endpoint when strict mode is enabled
- Override bind_tools method to add strict: true to tool definitions
- Add comprehensive tests for strict mode functionality

Resolves langchain-ai#32670
- Add robust fallback for response serialization when model_dump() fails
- Use model_dump_json() as fallback for non-OpenAI API responses
- Improve null choices error message with debugging information
- Add tests for vLLM-style responses and improved error messages

Fixes langchain-ai#32252
- Add MAX_TOKENS_PER_REQUEST constant (300,000 tokens)
- Implement dynamic batching in _get_len_safe_embeddings to respect token limits
- Track actual token counts per chunk and batch accordingly
- Apply same fix to async version _aget_len_safe_embeddings
- Add test to verify token limit is respected with large document sets

Fixes langchain-ai#31227
# Conflicts:
# libs/partners/openai/langchain_openai/embeddings/base.py
CodSpeed Performance Report: Merging #33668 will not alter performance.
Description
Fixes #31227 - Resolves the issue where OpenAIEmbeddings exceeds OpenAI's 300,000 token per request limit, causing 400 BadRequest errors.

Problem
When embedding large document sets, LangChain would send batches containing more than 300,000 tokens in a single API request, and the OpenAI API rejected those requests with a 400 BadRequest error.
The issue occurred because:

- Each text is split into chunks of up to embedding_ctx_length tokens (8191 tokens per chunk)
- Chunks are sent to the API in groups of chunk_size (default 1000 chunks per request)
- 1000 chunks × 8191 tokens = 8,191,000 tokens → far above the 300,000 token limit
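A minimal sketch of that worst-case arithmetic (variable names below are illustrative, mirroring the parameters described above):

```python
# Illustrative arithmetic only: why fixed-size batching can exceed the limit
embedding_ctx_length = 8191        # max tokens per chunk
chunk_size = 1000                  # default number of chunks per API request
max_tokens_per_request = 300_000   # OpenAI's per-request token limit

worst_case = chunk_size * embedding_ctx_length
print(worst_case)                            # 8191000
print(worst_case > max_tokens_per_request)   # True -> the API rejects the request
```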
Solution

This PR implements dynamic batching that respects the 300k token limit:

- Add a MAX_TOKENS_PER_REQUEST = 300000 constant
- Instead of sending fixed chunk_size batches, accumulate chunks until the next chunk would push the request past the 300k limit
- Apply the same logic in both _get_len_safe_embeddings and _aget_len_safe_embeddings
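A minimal sketch of the token-aware batching idea (the function and variable names here are illustrative, not the exact code in _get_len_safe_embeddings):

```python
MAX_TOKENS_PER_REQUEST = 300_000

def batch_by_token_count(chunks: list, token_counts: list[int]) -> list[list]:
    """Group chunks so that every batch stays under the per-request token limit."""
    batches: list[list] = []
    current: list = []
    current_tokens = 0
    for chunk, n_tokens in zip(chunks, token_counts):
        # Close the current batch if adding this chunk would exceed the limit
        if current and current_tokens + n_tokens > MAX_TOKENS_PER_REQUEST:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(chunk)
        current_tokens += n_tokens
    if current:
        batches.append(current)
    return batches
```

Each resulting batch is then sent as its own embeddings API request, so no single request exceeds 300,000 tokens.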
Changes

langchain_openai/embeddings/base.py:
- Add the MAX_TOKENS_PER_REQUEST constant and token-aware batching in the sync and async embedding paths

tests/unit_tests/embeddings/test_base.py:
- Add test_embeddings_respects_token_limit() - Verifies large document sets are properly batched

Testing
All existing tests pass (280 passed, 4 xfailed, 1 xpassed).
The new test verifies that large document sets are split across multiple API requests, with each request staying under the 300,000 token limit.
Usage
After this fix, users can embed large document sets without hitting the 300k token per request error.
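A minimal usage sketch (the model name and document set below are illustrative):

```python
from langchain_openai import OpenAIEmbeddings

# Illustrative corpus: large enough that naive batching previously exceeded
# the 300,000 token per-request limit
texts = ["some long document text " * 500 for _ in range(2000)]

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Chunks are now batched so that every API request stays under 300k tokens
vectors = embeddings.embed_documents(texts)
print(len(vectors))  # one embedding vector per input text
```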
Resolves #31227